[Prototype] ScyllaDB#3422

Open
Kbhat1 wants to merge 8 commits into main from mvcc-scylla

Kbhat1 commented May 12, 2026

Describe your changes and provide context

Testing performed to validate your change


Note

High Risk
High risk because it changes the state-store construction and introduces optional network-backed fallback reads for pruned historical versions, plus new Kafka→Scylla ingestion code paths that can affect correctness and operational reliability when enabled.

Overview
Adds a prototype ScyllaDB/Cassandra-backed historical state offload.

Nodes can now be configured (via new state-store.historical-offload-scylla-* flags and TOML fields) to wrap the primary SS in a historical.FallbackStateStore. The wrapper falls back to Scylla/Cassandra for point reads (Get/Has) of pruned versions, keeps iteration and writes on the local SS, and uses an in-memory LRU cache for repeated lookups.
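
To make the read path described above concrete, here is a minimal sketch of a fallback point reader with an LRU cache. It is not the PR's implementation: the `PointReader` interface, the `fallbackReader` and `lruCache` types, and the `earliestLocal` cutoff used to decide that a version is pruned are all hypothetical names chosen for illustration, and the real `historical.FallbackStateStore` may detect pruning differently. The sketch is also not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
)

// PointReader is a hypothetical interface covering only the point-read path.
type PointReader interface {
	Get(storeKey string, version int64, key []byte) ([]byte, error)
}

// cacheKey identifies one (store, version, key) lookup.
type cacheKey struct {
	store   string
	version int64
	key     string
}

type cacheEntry struct {
	key   cacheKey
	value []byte
}

// lruCache is a small, non-thread-safe LRU built on container/list.
type lruCache struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[cacheKey]*list.Element
}

func newLRUCache(capacity int) *lruCache {
	return &lruCache{capacity: capacity, order: list.New(), items: map[cacheKey]*list.Element{}}
}

func (c *lruCache) get(k cacheKey) ([]byte, bool) {
	el, ok := c.items[k]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*cacheEntry).value, true
}

func (c *lruCache) put(k cacheKey, v []byte) {
	if el, ok := c.items[k]; ok {
		el.Value.(*cacheEntry).value = v
		c.order.MoveToFront(el)
		return
	}
	c.items[k] = c.order.PushFront(&cacheEntry{key: k, value: v})
	if c.order.Len() > c.capacity {
		// Evict the least recently used entry.
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*cacheEntry).key)
	}
}

// fallbackReader serves versions >= earliestLocal from primary and older
// (assumed pruned) versions from remote, caching remote results.
type fallbackReader struct {
	primary, remote PointReader
	earliestLocal   int64
	cache           *lruCache
}

func (f *fallbackReader) Get(storeKey string, version int64, key []byte) ([]byte, error) {
	if version >= f.earliestLocal {
		return f.primary.Get(storeKey, version, key)
	}
	ck := cacheKey{store: storeKey, version: version, key: string(key)}
	if v, ok := f.cache.get(ck); ok {
		return v, nil
	}
	v, err := f.remote.Get(storeKey, version, key)
	if err != nil {
		return nil, err
	}
	f.cache.put(ck, v)
	return v, nil
}

// mapReader is an in-memory PointReader used only for the demo; it counts calls.
type mapReader struct {
	data  map[string][]byte
	calls int
}

func (m *mapReader) Get(storeKey string, version int64, key []byte) ([]byte, error) {
	m.calls++
	return m.data[fmt.Sprintf("%s/%d/%s", storeKey, version, key)], nil
}

func main() {
	local := &mapReader{data: map[string][]byte{"bank/12/k": []byte("new")}}
	remote := &mapReader{data: map[string][]byte{"bank/5/k": []byte("old")}}
	f := &fallbackReader{primary: local, remote: remote, earliestLocal: 10, cache: newLRUCache(2)}
	for _, v := range []int64{12, 5, 5} {
		val, _ := f.Get("bank", v, []byte("k"))
		fmt.Printf("version %d -> %s (remote calls so far: %d)\n", v, val, remote.calls)
	}
}
```

In this sketch the second read of version 5 is answered from the cache, so the remote reader is called only once. Caching is safe here only because historical versions are assumed to be immutable once written.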

Introduces a new Kafka consumer binary (historical-scylla-consumer) plus Scylla schema and sink implementation to ingest SS changelog entries into state_mutations/state_versions, and exports offload.NewSASLMechanism so the consumer shares the producer’s Kafka auth configuration. Also adds the gocql dependency (and related indirect deps) and updates tests to cover the new config and sink/reader behavior.

Reviewed by Cursor Bugbot for commit 4a14131.

github-actions Bot commented May 12, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | May 13, 2026, 8:08 PM |

codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 35.76538% with 449 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.14%. Comparing base (654d40b) to head (4a14131).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| sei-db/state_db/ss/offload/consumer/consumer.go | 0.00% | 177 Missing ⚠️ |
| sei-db/state_db/ss/offload/historical/scylla.go | 49.29% | 71 Missing and 1 partial ⚠️ |
| sei-db/state_db/ss/offload/consumer/scylla.go | 66.47% | 48 Missing and 11 partials ⚠️ |
| sei-db/state_db/ss/offload/historical/store.go | 39.34% | 32 Missing and 5 partials ⚠️ |
| sei-db/state_db/ss/offload/consumer/kafka.go | 41.37% | 33 Missing and 1 partial ⚠️ |
| sei-db/state_db/ss/store.go | 19.35% | 24 Missing and 1 partial ⚠️ |
| sei-db/state_db/ss/offload/consumer/config.go | 0.00% | 24 Missing ⚠️ |
| ...ad/consumer/cmd/historical-scylla-consumer/main.go | 0.00% | 21 Missing ⚠️ |

@@            Coverage Diff             @@
##             main    #3422      +/-   ##
==========================================
- Coverage   59.25%   59.14%   -0.12%     
==========================================
  Files        2110     2117       +7     
  Lines      174181   175478    +1297     
==========================================
+ Hits       103210   103779     +569     
- Misses      62044    62724     +680     
- Partials     8927     8975      +48     
| Flag | Coverage Δ |
| --- | --- |
| sei-chain-pr | 48.47% <35.76%> (?) |
| sei-db | 70.62% <ø> (+0.21%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| app/seidb.go | 74.39% <100.00%> (+2.39%) ⬆️ |
| sei-db/config/ss_config.go | 100.00% <ø> (ø) |
| sei-db/state_db/ss/offload/kafka.go | 59.37% <100.00%> (ø) |
| ...ad/consumer/cmd/historical-scylla-consumer/main.go | 0.00% <0.00%> (ø) |
| sei-db/state_db/ss/offload/consumer/config.go | 0.00% <0.00%> (ø) |
| sei-db/state_db/ss/store.go | 21.87% <19.35%> (-78.13%) ⬇️ |
| sei-db/state_db/ss/offload/consumer/kafka.go | 41.37% <41.37%> (ø) |
| sei-db/state_db/ss/offload/historical/store.go | 39.34% <39.34%> (ø) |
| sei-db/state_db/ss/offload/consumer/scylla.go | 66.47% <66.47%> (ø) |
| sei-db/state_db/ss/offload/historical/scylla.go | 49.29% <49.29%> (ø) |

... and 1 more

... and 2 files with indirect coverage changes


@Kbhat1 Kbhat1 marked this pull request as ready for review May 13, 2026 18:00

func scyllaHistoricalOffloadConfigured(cfg config.StateStoreConfig) bool {
	return strings.TrimSpace(cfg.HistoricalOffloadScyllaHosts) != "" ||
		strings.TrimSpace(cfg.HistoricalOffloadScyllaKeyspace) != ""

Partial Scylla config crashes node at startup

Medium Severity

scyllaHistoricalOffloadConfigured uses || (OR), so setting only HistoricalOffloadScyllaKeyspace without hosts (or vice versa) makes the function return true. The code then attempts to open a Scylla session with incomplete config, which fails validation and crashes node startup. Using && would be safer since both hosts and keyspace are required for a valid connection — a partial config would simply be treated as "not configured" rather than causing a startup failure.
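
One way to act on this finding is sketched below. It is a hedged variant of the reviewer's `&&` suggestion rather than the PR's code: `scyllaConfig` is a hypothetical stand-in for the two relevant fields of `config.StateStoreConfig`, and `scyllaOffloadEnabled` and `errPartialScyllaConfig` are illustrative names. The function enables the offload only when both fields are set, and additionally reports a partial configuration so the caller can choose to log a warning instead of silently ignoring it.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// scyllaConfig is a hypothetical stand-in for the hosts and keyspace
// fields of the real state-store config.
type scyllaConfig struct {
	Hosts    string
	Keyspace string
}

// errPartialScyllaConfig signals that exactly one of the two fields is set.
var errPartialScyllaConfig = errors.New("scylla historical offload: hosts and keyspace must be set together")

// scyllaOffloadEnabled returns true only when both fields are non-blank.
// A partial config yields (false, errPartialScyllaConfig) so that the caller
// decides whether to warn and continue or to fail startup.
func scyllaOffloadEnabled(cfg scyllaConfig) (bool, error) {
	hasHosts := strings.TrimSpace(cfg.Hosts) != ""
	hasKeyspace := strings.TrimSpace(cfg.Keyspace) != ""
	switch {
	case hasHosts && hasKeyspace:
		return true, nil
	case !hasHosts && !hasKeyspace:
		return false, nil
	default:
		return false, errPartialScyllaConfig
	}
}

func main() {
	configs := []scyllaConfig{
		{},
		{Keyspace: "history"},
		{Hosts: "10.0.0.1", Keyspace: "history"},
	}
	for _, cfg := range configs {
		enabled, err := scyllaOffloadEnabled(cfg)
		fmt.Printf("%+v -> enabled=%v err=%v\n", cfg, enabled, err)
	}
}
```

Returning the error separately keeps the startup path from opening a session with incomplete settings while still making the misconfiguration visible to the operator.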

Reviewed by Cursor Bugbot for commit 5ef70ba.

Comment thread sei-db/state_db/ss/offload/consumer/scylla.go
cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit 4a14131.

}
}
return g.Wait()
}

Internal errgroup cancellation misidentified as external cancellation

Medium Severity

In writeRecordsPipelined, all record-row goroutines share gctx from an errgroup. When a later record's write fails, the errgroup cancels gctx, causing earlier in-flight goroutines to fail with context.Canceled. Since rowDone channels are read in order, the main loop surfaces the earlier record's cancellation error. writeBatchWithRetry then calls isCancellation on this error, treats it as a real cancellation, and skips retries. workerLoop also sees the cancellation and silently returns nil, killing the worker. The fetcher eventually blocks trying to send to the dead worker's full shard channel, stalling the entire consumer.
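
The failure mode described above can be avoided by classifying an error as a cancellation only when the caller's own context is done, and by surfacing the root cause rather than a sibling's `context.Canceled`. The following sketch illustrates that idea with the standard library only. It does not use the PR's `writeRecordsPipelined`, `writeBatchWithRetry`, or `isCancellation`; `runAll`, `isExternalCancellation`, and `errWrite` are hypothetical names, and `runAll` imitates the first-error cancellation behavior of `errgroup.WithContext` using `context.WithCancelCause` (Go 1.20+).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

var errWrite = errors.New("write failed") // hypothetical sink error

// runAll runs tasks under a child context that is cancelled when the first
// task fails, and returns the recorded root cause (nil if nothing was
// cancelled). If the parent is cancelled, the cause is the parent's cause.
func runAll(parent context.Context, tasks ...func(context.Context) error) error {
	ctx, cancel := context.WithCancelCause(parent)
	defer cancel(nil)
	var wg sync.WaitGroup
	for _, task := range tasks {
		wg.Add(1)
		go func(task func(context.Context) error) {
			defer wg.Done()
			if err := task(ctx); err != nil {
				cancel(err) // only the first recorded cause is kept
			}
		}(task)
	}
	wg.Wait()
	return context.Cause(ctx)
}

// isExternalCancellation treats err as a real cancellation only when the
// caller's own context is done, not merely because err is context.Canceled.
func isExternalCancellation(parent context.Context, err error) bool {
	return errors.Is(err, context.Canceled) && parent.Err() != nil
}

// waitForCancel models an in-flight write that aborts when ctx is cancelled.
func waitForCancel(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

func failWrite(context.Context) error { return errWrite }

// siblingFailure: one task fails and cancels the other; the parent stays live.
func siblingFailure() (external bool, err error) {
	parent := context.Background()
	err = runAll(parent, waitForCancel, failWrite)
	return isExternalCancellation(parent, err), err
}

// externalCancel: the caller cancels the parent before the tasks run.
func externalCancel() (external bool, err error) {
	parent, cancel := context.WithCancel(context.Background())
	cancel()
	err = runAll(parent, waitForCancel, waitForCancel)
	return isExternalCancellation(parent, err), err
}

func main() {
	ext, err := siblingFailure()
	fmt.Printf("sibling failure: err=%v external=%v (retryable)\n", err, ext)
	ext, err = externalCancel()
	fmt.Printf("external cancel: err=%v external=%v (stop)\n", err, ext)
}
```

With this shape, a retry wrapper would retry the sibling-failure case because the parent context is still live, and would stop only when the caller actually cancelled.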

Reviewed by Cursor Bugbot for commit 4a14131.
